
Overview: Stephen Grossberg 2021 "Conscious Mind, Resonant Brain"


"... Indeed, one can explain and predict large amounts of psychological and neurobiological data using a small set of mathematical laws, such as the laws for short-term memory (STM), medium-term memory (MTM), and long-term memory (LTM), and a somewhat larger set of characteristic microcircuits, or modules, that embody useful combinations of functional properties, such as the properties of learning and memory, decision-making, and prediction. Thus, just as in physics, only a few basic laws, or equations, are used to explain and predict myriad facts about mind and brain, when they are embodied in modules that may be thought of as the "atoms" or "molecules" of intelligence.

Specializations of these laws in variations of these modules are then combined into larger systems that I like to call modal architectures, where the word 'modal' stands for different modalities of intelligence, such as vision, speech, cognition, emotion, and action. Modal architectures are less general than a general-purpose von Neumann computer, but far more general than a traditional AI algorithm. Modal architectures clarify, for example, why we have the five senses of sight, sound, touch, smell, and taste, and how they work. Continuing with the analogy from physics, modal architectures can be compared with macroscopic objects in the world.

These equations, modules, and modal architectures underlie unifying theoretical principles and mechanisms of all the brain processes that this book will discuss, and that my stories will summarize. ..."

(Grossberg 2021 "Conscious mind, resonant brain" Oxford University Press, page xi)
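To make "a few basic laws" concrete, here is a minimal, hedged Python sketch of the three timescales as coupled non-linear differential equations: shunting STM activities, habituative MTM transmitter gates, and gated steepest-descent LTM weights. The parameter values, quadratic signal function, and tiny network size are my own illustrative choices, not equations copied from the book.

import numpy as np

def f(x):
    """Faster-than-linear signal function; contrast-enhances STM activity."""
    return np.maximum(x, 0.0) ** 2

def simulate(I, steps=2000, dt=0.001,
             A=1.0, B=1.0,          # STM: passive decay rate and excitatory saturation level (illustrative)
             D=0.5, E=1.0, F=2.0,   # MTM: transmitter recovery rate, ceiling, depletion rate (illustrative)
             G=0.1):                # LTM: learning rate, the slowest timescale (illustrative)
    n = len(I)
    x = np.zeros(n)       # STM activities
    z = np.full(n, E)     # MTM habituative transmitter gates, start fully accumulated
    w = np.zeros(n)       # LTM adaptive weights
    for _ in range(steps):
        s = f(x) * z                                            # transmitter-gated signals
        dx = -A * x + (B - x) * (I + s) - x * (np.sum(s) - s)   # shunting on-center off-surround STM
        dz = D * (E - z) - F * f(x) * z                         # habituative gate: recovers, depletes with use
        dw = G * f(x) * (x - w)                                 # gated steepest descent: learn only when active
        x += dt * dx
        z += dt * dz
        w += dt * dw
    return x, z, w

if __name__ == "__main__":
    x, z, w = simulate(I=np.array([0.2, 0.5, 0.3]))
    print("STM x:", np.round(x, 3))
    print("MTM z:", np.round(z, 3))
    print("LTM w:", np.round(w, 3))

The point of the sketch is only the separation of timescales: STM settles quickly, the habituative gates change more slowly, and the weights change slowest of all, which is one way the "small set of mathematical laws" idea can be read.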

Table of Contents



Introduction

This is NOT a review of Grossberg's "Conscious Mind, Resonant Brain" book. Instead, it presents my own questions and why I have selected Grossberg's concepts in preference to countless other theories of consciousness.

As described on the Introduction webPage, questions driving this "webSite" (collection of webPages, defined by the menu above) are :
  1. Do "Large Language Models (LLMs) (such as chatGPT, LaMDA, etc) already exhibit a [protero, incipient] consciousness, in particular given the rough similarity of the basic unit of "Transformer Neural Networks" (TrNNs) to one of Grossberg's general "modules". The latter are proposed as a small number of units that are readily recombined with slight modifications as a basis for much of brain architecture, much like the small number of concepts in physics can be applied across a broad range of themes.
  2. How difficult would it be to augment "Transformer Neural Networks" (TrNNs) with Grossberg's [concept, architecture]s, including the emergent systems for consciousness? Perhaps this would combine the scalability of the former with the [robust, extendable] foundations of the latter, which is supported by [broad, diverse, deep] data from [neuroscience, psychology], as well success in real world advanced [science, engineering] applications?
  3. Are current (semi-manual) "controls" of "Large Language Models (LLMs) going in the direction of machine consciousness, and will they ultimately require this, in particular for [learning, evolution] in a stable an robust manner?
Non-priority questions include:
  1. Do TrNNs generate at least some of the "principles" from [Minsky grammar, Grossberg brain]? If not, what might that say about the [principle, TrNN]s? How does this relate to the Artificial Intelligence (AI) [rational logical, scientific] basis of thinking, versus the Computational Intelligence (CI) non-[rational logical, scientific] basis of thinking? What about [statistic, information theoretic] versus connectionist approaches? Of course, all combinations are permissible, from logic to CI.
  2. What is the [status, trend] of [transfer learning, modularization] of [RNN, CNN, TrNN]-based deep learning models, and how is that impacting the amount of ????????????
  3. Can TrNNs provide a means of rapidly building "conceptual models" like Minsky's grammar and Grossberg's ART?
  4. Can "conceptual models" like Minsky's grammar and Grossberg's ART provide more profound explanations of what TrNNs are doing?
  5. Does reliance on conventional [RNN, CNN, TrNN] architectures using conventional [statistic, information theoretic, machine learning] techniques mean that, in essence, these machines are "thinking" for you? Like the "Universal Function Approximation" (UFA) property of most NNs and many other approaches, does this [retard, prevent, disfavor] longer-term work to better understand complex systems? Or is this becoming irrelevant as the machines gain better capabilities, and will human understanding become a museum piece of the future?
Are most modern LLMs at least 50 years out of date on foundational concepts for understanding [neuron, brain, consciousness]? If so, that's understandable, as these were NOT drivers of LLM development.

[Grossberg 2021] provides a much better explanation of most of the concepts listed on this webSite (including [concept, history, etc]s outside of the domain of consciousness) than I could ever do. I felt that I could provide the most service by simply listing key concepts and the "Tables of Contents" for his book and for each chapter. Perhaps this webPage will also be a handy side-reference for those who have the book and are reading through it. The time it took to retype the content forced me to focus on details and pull together what I had read.



Grossberg: why ART is relevant to consciousness in Transformer NNs

This section is repeated in the Introduction webPage.


Stephen Grossberg 08Apr2023
Subject: relevance of Grossberg's conscious-ART concept to Transformer NNs?

"...
Grossberg has shown that ART is the UNIQUE class of neural networks that can AUTONOMOUSLY learn to attend, classify, and correct predictive errors in a changing world that is filled with unexpected events.

Grossberg derived ART using a THOUGHT EXPERIMENT in his 1980 article in Psychological Review called How Does a Brain Build a Cognitive Code. The 2021 book reviews this thought experiment, which uses only a few familiar facts from daily life as the hypotheses that lead to ART. The conclusion that ART is unique is thus hard to contradict.

Many discoveries using ART have continued to be made to the present, including how ART dynamics are realized in laminar neocortical circuits with identified neurons using spiking dynamics, also reviewed in the 2021 book.

Why is this relevant to the Vaswani et al 2017 "Attention is all you need" paper?

It is because ART is currently the most advanced cognitive and neural theory about how humans and machines can learn to PAY ATTENTION, and how attention dynamically stabilizes recognition learning in ART, thereby solving the CATASTROPHIC FORGETTING problem that afflicts back propagation and Deep Learning.

The fact that ART generates feature-category resonances, that explain and simulate lots of psychological and neurobiological data about CONSCIOUS recognition, also makes it highly relevant to AI efforts to design "sentient" algorithms.

Grossberg has also shown how other resonances support conscious seeing, hearing, and feeling, and characterizes the kinds of attention that occur during these events.

Every AI practitioner who is interested in attention and consciousness should thus study ART, if only to avoid reinventing the wheel.
..."
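As a concrete (and heavily simplified) illustration of the ART search cycle that Grossberg describes above, here is a hedged Python sketch of ART1-style hypothesis testing on binary inputs: category choice, a vigilance test for resonance, mismatch reset, and fast learning. The class name, parameter values, and code organization are mine, not Grossberg's; it is a sketch of the idea, not his model.

import numpy as np

class ART1Sketch:
    """Minimal ART1-style search cycle for binary input vectors (illustrative sketch).

    Each stored category j has a binary template w_j (top-down expectation).
    For an input I, categories are ranked by a choice function, then tested against
    the vigilance criterion |I AND w_j| / |I| >= rho.  A failed test resets that
    category and the search continues; if nothing resonates, a new category is
    recruited.  Fast learning: w_j <- I AND w_j.
    """

    def __init__(self, n_features, vigilance=0.6, alpha=0.001):
        self.n = n_features
        self.rho = vigilance          # vigilance parameter (illustrative default)
        self.alpha = alpha            # choice parameter (tie-breaking)
        self.templates = []           # binary templates (top-down LTM expectations)

    def _choice(self, I, w):
        match = np.minimum(I, w)      # componentwise AND for binary vectors
        return match.sum() / (self.alpha + w.sum())

    def train_one(self, I):
        I = np.asarray(I, dtype=float)
        order = sorted(range(len(self.templates)),
                       key=lambda j: self._choice(I, self.templates[j]),
                       reverse=True)
        for j in order:                               # hypothesis testing / memory search
            w = self.templates[j]
            match = np.minimum(I, w).sum() / max(I.sum(), 1e-12)
            if match >= self.rho:                     # resonance: attend and learn
                self.templates[j] = np.minimum(I, w)  # fast learning
                return j
            # else: mismatch -> reset this category and continue the search
        self.templates.append(I.copy())               # no resonance: recruit a new category
        return len(self.templates) - 1

if __name__ == "__main__":
    art = ART1Sketch(n_features=6, vigilance=0.6)
    patterns = [[1,1,0,0,0,0], [1,1,1,0,0,0], [0,0,0,1,1,0], [0,0,0,1,1,1]]
    print([art.train_one(p) for p in patterns])   # -> [0, 0, 1, 1] with these inputs

The vigilance parameter rho controls the granularity of learned categories (low rho gives broad categories, high rho gives many specific ones), and learning only happens inside a resonant match, which is the mechanism Grossberg credits for avoiding catastrophic forgetting while remaining plastic.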




Grossberg's [non-linear DEs, CogEm, CLEARS, ART, LAMINART, cART] models

1958-59 ?among the world's first? systems of non-linear differential equations for NNs

Grossberg's development of neural network systems of non-linear differential equations is especially important, as it presaged their use, in one form or another, by almost all modern NNs. This work started with a 1957-58 high school project and, following improvements over time, was first published in ?1967?. Strangely, that is around the same time as the famous Minsky-Papert critique claiming that the existing neural networks, generally limited to a single layer?, would not be adequate for solving general problems.

For sure, most models don't just use non-linear DEs, but a mix of other tools as well.

Which NN models DO use non-linear DEs? Which NN models DON'T use non-linear DEs? (See [Grossberg 2021] pages ??-?? for descriptions of many of these.)
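
To show what one of these non-linear DEs looks like in practice, here is a hedged Python sketch of a feedforward shunting on-center off-surround network integrated to equilibrium; the parameters and inputs are illustrative choices, not values from the book. Its steady state, x_i = B*I_i / (A + sum_k I_k), normalizes total activity while preserving the relative input pattern, the classic shunting answer to the noise-saturation dilemma.

import numpy as np

A, B = 1.0, 1.0     # passive decay rate and upper (saturation) bound; illustrative values

def steady_state(I, steps=200):
    """Integrate dx_i/dt = -A*x_i + (B - x_i)*I_i - x_i*sum_{k != i} I_k to equilibrium."""
    I = np.asarray(I, dtype=float)
    x = np.zeros_like(I)
    dt = 0.5 / (A + I.sum())         # step small enough for the fastest decay rate -(A + sum I)
    for _ in range(steps):
        x += dt * (-A * x + (B - x) * I - x * (I.sum() - I))
    return x

if __name__ == "__main__":
    I = np.array([1.0, 2.0, 3.0])
    for scale in (1.0, 10.0, 100.0):
        x = steady_state(scale * I)
        print(f"scale={scale:6.1f}  x={np.round(x, 3)}  relative pattern={np.round(x / x.sum(), 3)}")
    # The relative pattern stays ~[0.167, 0.333, 0.5] as inputs are scaled up:
    # total activity is normalized (x_i -> B*I_i / (A + sum_k I_k)), so the network
    # processes input ratios rather than saturating at high input intensities.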

?date? CogEm Cognitive-Emotional model

?date? CLEARS [Cognition, Learning, Expectation, Attention, Resonance, Synchrony]

1976 ART Adaptive Resonance Theory

?date? LAMINART Laminar computing ART

?date? cART conscious ART

How many other theories of consciousness have evolved directly from a background like this?



The underlying basis in [bio, psycho]logical data

See [Grossberg 2021] for references to papers about experiments.

Rather than collect a listing of [experiments, data] that have been used to [develop, test, evolve] Grossberg's concepts, it is much easier to extract [chapter, section, sub-section] titles with keywords. Obviously, Large Language Models (LLMs) such as [LaMDA, BARD, chatGPT, etc] would make for an interesting comparison, but in any case it would be better to do searches on the full text of the book.


+-----+
[bio, neuro, psycho]logy data
grepStr='data|monkey|sea urchin|Slime mold|slug|biology|Psychological|neurophysiological|perceptual'
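
A hedged sketch of how the grepStr above can be applied: the file name "grossberg2021_toc.txt" is hypothetical (a plain-text dump of the book's table of contents), and I have made the pattern case-insensitive so that, for example, "Psychological" also matches "psychological".

import re

# Filter table-of-contents lines with the author's grepStr (made case-insensitive).
grep_str = r"data|monkey|sea urchin|Slime mold|slug|biology|Psychological|neurophysiological|perceptual"
pattern = re.compile(grep_str, re.IGNORECASE)

with open("grossberg2021_toc.txt", encoding="utf-8") as toc:   # hypothetical file name
    for line in toc:
        if pattern.search(line):
            print(line.rstrip())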

A universal development code (p618): Mental measurements embody universal laws of cell biology and physics
I will have to read the book several times to get this all to sink in, notwithstanding 3 decades of reading a small subset of Grossberg's [book, paper]s. Ultimately, though, only working with code and data goes far enough. My re-typing of this material will have errors.

  • Cooperation and competition are universal in biology, including in brain networks 70
  • A perceptual disaster: Uncontrolled filling-in 144
  • Neurophysiological data support end cut predictions 147
  • Explaining texture data with the double filter 156
  • Neurobiological data for ART matching by corticogeniculate feedback 193
  • Neurobiological data for ART matching in visual, auditory, and somatosensory cortex 194
  • Additional brain data about attentive category learning and orienting search 229
  • Explaining human categorization data with ART: Learning rules-plus-exceptions 241
  • Many kinds of psychological and neurobiological data have been explained by ART 249
  • Explaining data about visual neglect: Coordinates, competition, grouping, and action 256
  • Explaining data about visual crowding and situational awareness 259
  • Towards a unified explanation of data about crowding, visual search, and neglect 261
  • Explaining data about change blindness and motion-induced blindness: No map? 262
  • Explaining many data with the same model mechanisms 264
  • Human and monkey data support shroud reset properties: Explanations and predictions 267
  • Target swapping data: Why catastrophic forgetting of invariant categories does not happen 271
  • Two types of perceptual stability cooperate during active conscious vision 276
  • Data supporting the MT-MSTv prediction 316
  • Psychological and neurophysiological data supporting feedback during motion capture 318
  • Converting motion into action during perceptual decision-making 331
  • Probabilistic motion-based decision-making in monkeys 333
  • Perceptual grouping and attention: Interactions, similarities, and differences 355
  • Analog coherence: solution of the binding problem for perceptual grouping without loss of analog sensitivity 356
  • Fast feedforward processing when data are unambiguous 356
  • Laminar mechanisms of preattentive perceptual grouping 358
  • A unified view of developmental, neurophysiological, and perceptual processes and data 366
  • When attention is not needed to learn: Perceptual learning without awareness 366
  • Dense RDS that induce perceptual completion of partially occluded objects 388
  • Explaining stimulus rivalry and eye rivalry data in a unified way: Habituation again! 398
  • Some simulated auditory streaming data 424
  • Mechanistic unifications of psychologically diverse data 435
  • Psychological and neurophysiological data support predicted working memory properties 441
  • Data supporting MTM by activity-dependent habituative gates 471
  • Adaptive resonance in lexical decision tasks: Error rate vs. reaction time data 472
  • Explaining chunk data from the tachistoscopic condition using ART 474
  • Explaining data from the reaction time condition using ART: List item error trade-off 474
  • Transmitter accumulation and release: Infinity does not exist in biology! 500
  • Some classical data about sequential dependencies influencing future choices 528
  • RTs in behavioural data and simulations about object and spatial searches 534
  • Paradoxical memory consolidation data follow three obvious behavioural facts 553
  • Data and model simulations of grid cell properties along the dorsoventral axis of MEC 592
  • Mental measurements embody universal laws of cell biology and physics 618
  • A universal development code for biology: Computing with cellular patterns 625
  • Blastula to gastrula in the sea urchin 631
  • Slime mold aggregation and slug motion 634


    The following "questions" indirectly reflect the type of underlying data that has been used. The list below was retyped from the book, and will have errors.

    This list was manually re-typed from [Grossberg 2021, page xix].

    [Grossberg 2021, page xx] follows the list with :
    "... In order to make the chapters self-contained, I review some model properties each time they occur, even if they appear in more than one chapter. This has a deeper purpose than providing self-contained chapters. It clarifies how a small number of brain mechanisms are used in specialized forms in multiple parts of our brains to realize psychological functions that appear in our daily lives to be quite unrelated.
    ..."


    >>>add list from section below "Many kinds of psychological and neurobiological data have been explained by ART"


    Comparison of rivalry models for 3D vision with binocular rivalry and stable vision

    Author                    | Levelt (1967) data | Muller & Blake (1989) data | Does both: [eyes, stimulus] rivalry | Explains patchy percepts | Explains rivalry from normal 3-D vision* | Explains rivalry-based V1 modulation | Uses visual input patterns
    Matsuoka (1984)           | No | No | No | No | No | No | No
    Mueller (1990)            | No | Yes | No | No | No | No | No
    Laing & Chow (2002)       | Yes | No slope simulation | No | No | No | No | No
    Stollenwerk & Bode (2003) | Yes | Only CC paradigm | No | Yes | No | No | No
    Wilson (2003)             | Claims it would work if noise added | No | Yes | No | No | No | No
    Freeman (2005)            | Partially (very long dominance durations) | Only CC paradigm | Yes | No | No | No | No
    Lankheet (2006)           | Yes | No | No | No | No | No | No
    Grossberg et al           | Yes | Yes | Yes | Yes | Yes | Yes | Yes

    Figure 11.34: A comparison of the properties of other rivalry models with those of the 3D LAMINART model (green background). Significantly, only 3D LAMINART explains both stable vision and rivalry (red background). [Grossberg 2021 p395]



    Early ARTMAP benchmark studies

    These are now dated (from 1991-1997), given the performance of Deep Learning and Transformer NNs. They were used in applications where other algorithms failed, e.g. the Boeing CAD group technology application (Grossberg 2021, Figure 5.33, page 225).



    Credibility from non-[bio, psycho]logical applications of Grossberg's ART

    1. airplane design
    2. medical database diagnosis and prediction
    3. remote sensing and geospatial mapping and classification
    4. multidimensional data fusion
    5. classification of data from artificial sensors with high dynamic noise and dynamic range
      (synthetic aperture radar, laser radar, multi-spectral infra-red, night vision)
    6. speaker-normalized speech recognition
    7. automatic rule extraction and hierarchical knowledge discovery
    8. machine vision and image understanding
    9. mobile robot controllers
    10. satellite remote sensing image classification
    11. sonar classification
    12. musical analysis
    13. electrocardiogram wave recognition
    14. prediction of protein folding secondary structure
    15. strength prediction for concrete mixes
    16. tool failure monitoring
    17. chemical analysis from ultraviolet and infrared spectra
    18. design of electromagnetic systems
    19. face recognition
    20. familiarity discrimination
    21. power transmission line losses
    (from [Grossberg 2021], p13c1h0.6; http://techlab.bu.edu is no longer available)
    As stated in [Grossberg 2021 p13c1h1.0] : "... This range of applications is possible because ART models embody general-purpose properties that are needed to solve the stability-plasticity dilemma in many different types of environments. In all these applications, insights about cooperative-competitive dynamics also play a critical role. ..."

    >>> add content of subSection "Multiple applications of ART to large-scale problems in engineering and technology"

    Granted, the lists above don't tell you which applications may still be competitive with modern toolsets (especially deep learning), and in which contexts, nor whether TrNNs and deep learning may turn out to be "quick tools" that ART can greatly improve. What is the commercial uptake in such applications? For sure, the most [popular, successful] modern NNs do NOT use Grossberg models, instead relying on gradient descent, statistics, information theory (which I like to consider separately from statistics), and many other techniques, including classical Artificial Intelligence (which to me is very different from Computational Intelligence) and [logic, procedural, object-oriented] normal programming.


    +-----+
    Here is a recent reference to ART for engineering applications :

    John Seiffertt (2019) "Adaptive Resonance Theory in the time scales calculus", Neural Networks, Volume 120, Pages 32-39, ISSN 0893-6080, https://doi.org/10.1016/j.neunet.2019.08.010, https://www.sciencedirect.com/science/article/pii/S0893608019302278

    Abstract: Engineering applications of algorithms based on Adaptive Resonance Theory have proven to be fast, reliable, and scalable solutions to modern industrial machine learning problems. A key emerging area of research is in the combination of different kinds of inputs within a single learning architecture along with ensuring the systems have the capacity for lifelong learning. We establish a dynamic equation model of ART in the time scales calculus capable of handling inputs in such mixed domains. We prove theorems establishing that the orienting subsystem can affect learning in the long-term memory storage unit as well as that those remembered exemplars result in stable categories. Further, we contribute to the mathematics of time scales literature itself with novel takes on logic functions in the calculus as well as new representations for the action of weight matrices in generalized domains. Our work extends the core ART theory and algorithms to these important mixed input domains and provides the theoretical foundation for further extensions of ART-based learning strategies for applied engineering work.
    Keywords: Machine learning; Adaptive resonance theory; Unsupervised learning; Control theory; Time scales
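
    To give a feel for the "time scales calculus" mentioned in the abstract (this is NOT Seiffertt's ART model, just the basic delta derivative), here is a minimal Python sketch on a mixed time scale that glues a densely sampled interval to a few isolated points; the function and the points are my own illustrative choices.

import numpy as np

# A time scale is a closed subset of the reals.  Here we mix a densely sampled
# interval [0, 1] with the isolated points {2, 3, 4}.  On right-scattered points
# the delta derivative is the forward difference quotient (f(sigma(t)) - f(t)) / mu(t);
# on right-dense points it approaches the ordinary derivative.  For f(t) = t**2 it
# equals sigma(t) + t, i.e. ~2*t on the dense part and 2*t + mu(t) on isolated points.

def delta_derivative(ts, fs):
    """Forward-difference delta derivative at all but the last point of the time scale."""
    ts, fs = np.asarray(ts, dtype=float), np.asarray(fs, dtype=float)
    mu = np.diff(ts)                      # graininess mu(t) = sigma(t) - t
    return (fs[1:] - fs[:-1]) / mu        # (f(sigma(t)) - f(t)) / mu(t)

if __name__ == "__main__":
    dense = np.linspace(0.0, 1.0, 101)    # approximates a continuous interval
    isolated = np.array([2.0, 3.0, 4.0])  # genuinely discrete part of the time scale
    T = np.concatenate([dense, isolated])
    f = T ** 2
    dfd = delta_derivative(T, f)
    print("near t=0.5 (dense):   ", round(dfd[50], 3))   # ~1.01, close to the derivative 2*t = 1
    print("at t=1.0 (scattered): ", round(dfd[100], 3))  # (4 - 1)/(2 - 1) = 3.0
    print("at t=2.0 (scattered): ", round(dfd[101], 3))  # (9 - 4)/(3 - 2) = 5.0

    The same dynamic equation can thus be posed once on a time scale and specialize to a differential equation on the dense part and a difference equation on the discrete part, which is the sense in which such models can handle mixed input domains.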